Skip to content

[WIP] Change Event Based Governance Workflow#26758

Open
yan-3005 wants to merge 8 commits intoram/workflow-improvementsfrom
ram/workflow-improvements-change-event-offset
Open

[WIP] Change Event Based Governance Workflow#26758
yan-3005 wants to merge 8 commits intoram/workflow-improvementsfrom
ram/workflow-improvements-change-event-offset

Conversation

@yan-3005
Copy link
Copy Markdown
Contributor

@yan-3005 yan-3005 commented Mar 25, 2026

Describe your changes:

Fixes https://github.com/open-metadata/openmetadata-collate/issues/3364

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Change-event-driven workflow triggers:
    • Replaced FetchEntitiesImpl with FetchChangeEventsImpl for incremental offset-based change event processing
    • Added CommitChangeEventOffsetImpl to persist consumed offsets per workflow and entity type
  • Database schema changes:
    • Added composite index on change_event(entityType, offset) for efficient filtering
    • Widened change_event_consumers.id from VARCHAR(36) to VARCHAR(768) for workflow consumer IDs
  • Trigger workflow updates:
    • Modified PeriodicBatchEntityTrigger to fetch change events instead of searching; added workflow FQN parameter
    • Updated trigger schema to make filters optional
  • Tests and integration:
    • Enhanced WorkflowDefinitionResourceIT with multi-batch change event test flow

This will update automatically on new commits.

yan-3005 and others added 2 commits March 24, 2026 12:49
…ted task nodes

Phase 1 of batch entity processing in governance workflows. All
automated task nodes (checkEntityAttributesTask, setEntityAttributeTask,
checkChangeDescriptionTask, rollbackEntityTask, sinkTask,
dataCompletenessTask) now process a List<String> of entity links via
entityList exclusively. The relatedEntity fallback in getEntityList()
is removed — batch nodes no longer have that path.

Key changes:

- PeriodicBatchEntityTrigger (singleExecutionMode=false): each child
  process now receives global_entityList via ${entityToListMap[relatedEntity]}.
  FetchEntitiesImpl pre-builds entityToListMap (entity -> List.of(entity))
  so the JUEL expression resolves without static class references.

- Batch node impls (6 files): removed relatedEntity fallback from
  getEntityList() and the now-unused RELATED_ENTITY_VARIABLE import.
  entityList is the only input path.

- BPMN builders: putIfAbsent(ENTITY_LIST_VARIABLE, GLOBAL_NAMESPACE) for
  all batch-capable nodes; relatedEntity is never added to inputNamespaceMap
  by builders.

- v1130 migration (addEntityListToNamespaceMap): updated to also strip
  relatedEntity from batch node inputNamespaceMaps, covering both fresh
  upgrades (add entityList + remove relatedEntity) and instances that
  already ran the previous migration (entityList present, remove relatedEntity).
  Migration remains idempotent.

- GlossaryApprovalWorkflow.json: removed relatedEntity from all 13 batch
  node inputNamespaceMaps. userApprovalTask nodes keep relatedEntity.

- JSON schemas: entityList added to trigger output and batch node
  inputNamespaceMap/input definitions.

- Integration tests: updated to reflect entityList-first structure across
  all batch-capable workflow node configurations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

Comment on lines +136 to +145
private String resolveEntityType(DelegateExecution execution) {
String processDefinitionKey = execution.getProcessDefinitionId().split(":")[0];
if (processDefinitionKey.contains("-")) {
String[] parts = processDefinitionKey.split("-");
if (parts.length >= 2) {
return parts[parts.length - 1];
}
}
return (String) entityTypesExpr.getValue(execution);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Bug: resolveEntityType falls back to expression on ambiguous key

In FetchChangeEventsImpl.resolveEntityType() (lines 136-145), the process definition key is split on - and the last segment is used as the entity type. If the process key contains - but has fewer than 2 parts after splitting (which can't actually happen with split), or if the entity type itself contains a -, the wrong value will be extracted. For example, an entity type like api-endpoint would be parsed as just endpoint.

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

…ts-change-event-offset

- Keep two-phase migration in MigrationUtil (raw JSON Phase 1, BPMN redeploy Phase 2)
- Keep workflowFqn parameter in PeriodicBatchEntityTrigger constructor
- Keep FetchChangeEventsImpl as primary fetch task (no FetchEntitiesImpl)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@yan-3005 yan-3005 self-assigned this Mar 27, 2026
Combines both changes:
- hasBatchModeNodes(workflow) for singleExecutionMode (from base)
- workflow.getFullyQualifiedName() for workflowFqn (from PR branch)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@github-actions
Copy link
Copy Markdown
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@yan-3005 yan-3005 added the safe to test Add this label to run secure Github workflows on PRs label Mar 27, 2026
@github-actions
Copy link
Copy Markdown
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

@github-actions github-actions bot requested a review from a team as a code owner March 27, 2026 15:07
@gitar-bot
Copy link
Copy Markdown

gitar-bot bot commented Mar 27, 2026

Code Review 👍 Approved with suggestions 2 resolved / 3 findings

Refactors governance workflows to process change events with batch entity handling for automated task nodes. Resolved schema validation and batch sink execution issues; consider handling ambiguous entity type resolution more explicitly in FetchChangeEventsImpl.

💡 Bug: resolveEntityType falls back to expression on ambiguous key

📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/triggers/impl/FetchChangeEventsImpl.java:136-145

In FetchChangeEventsImpl.resolveEntityType() (lines 136-145), the process definition key is split on - and the last segment is used as the entity type. If the process key contains - but has fewer than 2 parts after splitting (which can't actually happen with split), or if the entity type itself contains a -, the wrong value will be extracted. For example, an entity type like api-endpoint would be parsed as just endpoint.

✅ 2 resolved
Bug: Schema maxItems:2 rejects 3-element output arrays used in tests

📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/triggers/periodicBatchEntityTrigger.json:97 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:2335 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:2435 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:2619 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:3095 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:3969
The periodicBatchEntityTrigger.json schema changed maxItems from 1 to 2, but integration tests (and likely real workflow definitions) use 3-element output arrays like ["relatedEntity", "entityList", "updatedBy"]. With maxItems: 2, these will fail JSON Schema validation.

There are 12+ instances in WorkflowDefinitionResourceIT.java with 3-item outputs. The schema constraint must be raised to at least 3, or removed entirely if the output list is open-ended.

Bug: singleExecutionMode hardcoded to false, batch sink broken

📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/TriggerFactory.java:27
In TriggerFactory.java, singleExecutionMode is now hardcoded to false (line 27), removing the dynamic hasBatchModeNodes() check that previously detected sink tasks needing batch processing. This means workflows with batch-mode sink nodes will now always spawn N parallel instances instead of a single instance processing all entities, breaking the batch sink use case.

The PeriodicBatchEntityTrigger constructor still accepts and uses the parameter (controlling cardinality and input mapping), but the factory never passes true. Tests still test both modes, masking this gap.

🤖 Prompt for agents
Code Review: Refactors governance workflows to process change events with batch entity handling for automated task nodes. Resolved schema validation and batch sink execution issues; consider handling ambiguous entity type resolution more explicitly in FetchChangeEventsImpl.

1. 💡 Bug: resolveEntityType falls back to expression on ambiguous key
   Files: openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/triggers/impl/FetchChangeEventsImpl.java:136-145

   In `FetchChangeEventsImpl.resolveEntityType()` (lines 136-145), the process definition key is split on `-` and the last segment is used as the entity type. If the process key contains `-` but has fewer than 2 parts after splitting (which can't actually happen with split), or if the entity type itself contains a `-`, the wrong value will be extracted. For example, an entity type like `api-endpoint` would be parsed as just `endpoint`.

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@github-actions
Copy link
Copy Markdown
Contributor

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 64%
64.84% (58138/89663) 44.68% (30728/68763) 47.68% (9201/19296)

@sonarqubecloud
Copy link
Copy Markdown

@sonarqubecloud
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown
Contributor

🟡 Playwright Results — all passed (18 flaky)

✅ 3398 passed · ❌ 0 failed · 🟡 18 flaky · ⏭️ 216 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 453 0 2 2
🟡 Shard 2 603 0 1 32
🟡 Shard 3 603 0 6 27
🟡 Shard 4 598 0 5 47
✅ Shard 5 587 0 0 67
🟡 Shard 6 554 0 4 41
🟡 18 flaky test(s) (passed on retry)
  • Flow/Tour.spec.ts › Tour should work from help section (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Features/SampleDataTableOperations.spec.ts › should download sample data as CSV when export is clicked (shard 3, 1 retry)
  • Flow/AddRoleAndAssignToUser.spec.ts › Verify assigned role to new user (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set and remove default persona should work properly (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › Entity Reference (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for DashboardDataModel (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Rename domain with assets (tables, topics, dashboards) preserves associations (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Multiple consecutive domain renames preserve all associations (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Tag Add, Update and Remove (shard 4, 1 retry)
  • Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Directory (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant